Integrating TELLTALE as the CARROT2 IR Engine
Abstract
CARROT2 is a distributed information retrieval system. Its backbone IR engine is currently a piece of legacy software called Managing Gigabytes. The goal of this project is to replace that engine with a different one, TELLTALE, which is also legacy software. In this paper, we discuss some other legacy integrations, some of the concepts and ideas critical to legacy integration, and some tools created to help integrate legacy systems into current-day technologies.
Similar resources
Carrot2: Design of a Flexible and Efficient Web Information Retrieval Framework
In this paper we present the design goals and implementation outline of Carrot2, an open source framework for rapid development of applications dealing with Web Information Retrieval and Web Mining. The framework has been written from scratch keeping in mind flexibility and efficiency of processing. We show two software architectures that meet the requirements of these two aspects and provide ev...
Carrot2 and Language Properties in Web Search Results Clustering
This paper concerns search results clustering, a technique for improving how Web search engines present their results. We introduce Carrot2, an open, extensible research system for examining and developing search results clustering algorithms. We also discuss attempts to measure the quality of discovered clusters and demonstrate results of our experiments with quality assessment when...
An Algorithm for Clustering of Web Search Results
In this thesis we propose a description-oriented algorithm for clustering of results obtained from Web search engines called LINGO. The key idea of our method is to first discover meaningful cluster labels and then, based on the labels, determine the actual content of the groups. We show how the cluster label discovery can be accomplished with the use of the Latent Semantic Indexing technique. ...
Performance and Scalability of a Large-Scale N-gram Based Information Retrieval System
Information retrieval has become more and more important due to the rapid growth of all kinds of information. However, there are few suitable systems available. This paper presents a few approaches that enable large-scale information retrieval for the TELLTALE system. TELLTALE is a dynamic hypertext information retrieval environment. It provides full-text search for text corpora that may be gar...
HySpirit - A Probabilistic Inference Engine for Hypermedia Retrieval in Large Databases
HySpirit is a retrieval engine for hypermedia retrieval integrating concepts from information retrieval (IR) and deductive databases. The logical view on IR models retrieval as uncertain inference, for which we use probabilistic reasoning. Since the expressiveness of classical IR models is not sufficient for hypermedia retrieval, HySpirit is based on a probabilistic version of Datalog. In hyperme...
Publication date: 2001